Multistage Markov Decision Processes with Minimum Criteria of Random Rewards

Authors

  • Yoshio Ohtsubo
  • Y. Ohtsubo
Abstract

We consider multistage decision processes in which the criterion function is the expectation of a minimum function. We formulate them as Markov decision processes with imbedded parameters. The policy depends on a history that includes past imbedded parameters, and the reward at each stage is random and depends on the current state, the action, and the next state. We then give an optimality equation by using operators and show that there exists a right continuous deterministic Markov policy, which depends on the current state and an imbedded parameter.
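
To make the role of the imbedded parameter concrete, here is a minimal Python sketch, not the paper's operator formulation. It assumes a finite horizon, a goal of maximizing the expected minimum of the stage rewards, and that the imbedded parameter can be read as the running minimum of the rewards seen so far. The two-state model `P`, the constant `HORIZON`, and the function `value` are hypothetical names introduced only for illustration.

```python
from functools import lru_cache

HORIZON = 2  # number of decision stages (assumed finite for this sketch)

# Hypothetical toy model: P[s][a] is a list of (probability, next_state, reward)
# triples, so the reward depends on the current state, the action, and the
# next state.
P = {
    0: {0: [(1.0, 0, 2.0)],
        1: [(0.5, 1, 5.0), (0.5, 0, 1.5)]},
    1: {0: [(1.0, 1, 3.0)],
        1: [(0.5, 0, 4.0), (0.5, 1, 0.0)]},
}

@lru_cache(maxsize=None)
def value(t, s, m):
    """Return (best expected final minimum, best action) at stage t,
    in state s, when the running minimum of past rewards is m."""
    if t == HORIZON:
        # No stages left: the criterion is the minimum accumulated so far.
        return m, None
    best_v, best_a = float("-inf"), None
    for a, outcomes in P[s].items():
        # Augment the state with the updated running minimum min(m, r).
        v = sum(p * value(t + 1, s2, min(m, r))[0] for p, s2, r in outcomes)
        if v > best_v:
            best_v, best_a = v, a
    return best_v, best_a

# Before any reward is observed, the running minimum is +infinity.
v0, a0 = value(0, 0, float("inf"))
print(v0, a0)
```

In this toy model the chosen action at stage 1 in state 0 changes with the running minimum: `value(1, 0, float("inf"))` selects action 1, while `value(1, 0, 2.0)` selects action 0. This illustrates why a policy for such a criterion needs the parameter in addition to the current state.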


Similar resources

Chapter for MARKOV DECISION PROCESSES

Mixed criteria are linear combinations of standard criteria which cannot be represented as standard criteria. Linear combinations of total discounted and average rewards as well as linear combinations of total discounted rewards are examples of mixed criteria. We discuss the structure of optimal policies and algorithms for their computation for problems with and without constraints.


Risk-Sensitive and Average Optimality in Markov Decision Processes

This contribution is devoted to risk-sensitive optimality criteria in finite state Markov Decision Processes. First, we rederive necessary and sufficient conditions for average optimality of (classical) risk-neutral unichain models. This approach is then extended to the risk-sensitive case, i.e., when expectation of the stream of one-stage costs (or rewards) generated by a Mark...


Discounted approximations of undiscounted stochastic games and Markov decision processes are already poor in the almost deterministic case

It is shown that the discount factor needed to solve an undiscounted mean payoff stochastic game to optimality is exponentially close to 1, even in one-player games with a single random node and polynomially bounded rewards and transition probabilities. On the other hand, for the class of the so-called irreducible games with perfect information and a constant number of random nodes, we obtain a ...


Tucson Workshop on Computational and Behavioral Decision Making

Abstracts (in alphabetical order). Modeling Decisions in Complex and Risky Multistage Decision Processes, Ronald G. Askin and Mengying Fu. Human decision makers are known to be affected by factors such as anchoring, certainty preference, approach-avoidance conflicts, risk aversion, and framing. However, the majority of previous research has studied behavior in relatively simple decision situations with onl...


A probabilistic analysis of bias optimality in unichain Markov decision processes

This paper focuses on bias optimality in unichain, finite state and action space Markov Decision Processes. Using relative value functions, we present new methods for evaluating optimal bias. This leads to a probabilistic analysis which transforms the original reward problem into a minimum average cost problem. The result is an explanation of how and why bias implicitly discounts future rewards.




Publication date: 2006